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\ INTRODUCTION 



One of the prirnary goals of the Castor Engine V'LDB design was to have a clean abstraction for the 
SQL generation module. The requirement is to have the SQL generation part independent of the 
core engine processing functionality. In addition we also want the capability to change the syntax 
without having to recompile the system. 

Tn this document we talk more about the abstraction and design that Is used to generate the syntax so 
that it can be changed dynamically. 



Example SQL 

A typical SQL statement generated by the SQL engine is shown below. 




The above sample SQL was generated fox SQL Server. The same SQL statement when generated 
for Oracle is shown below. 




The two SQL statements shown above, bom create implicit tables. Notice the difference between 
the implicit tabic creation syntax for the two databases,. Also notice the difference between the join 
syntax for the two databases - SQL server is generating SQL-92 standard syntax whereas Oracle 
needs the 'traditional' SQL syntax. 

Given the following two VLDB syntax generation settings, we can write a syntax generation 
module that can generate the two SQL statements shown above. 

1 . J01N_SYNTAX (with, possible values JOIN92 and JOIN89) 

2, IMPLIC1T_TABLE_SYNTAX (with possible values ORACLE_STYLE and 
SQLSERVER_STVL£) 



Factors Affecting SQL Generation 

Section 1 . 1 gave an introduction on how the engine needs to generate different syntax. We saw how 
certain 'settings* could be used, to control SQL generation. If you take into account the different 
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situations in wliich the SQL Engine generates syntax wo can come up with many factors that affect 
the decision to generate syntax at different points. We now discuss the different factors and give 
examples for each of the factors. 



1 .2*1 VLDB Setting 

As we saw in the example previously, VLDB Settings obviously affect the way syntax gets 
generated One setting wc saw, JOIN_SYNTAX, governs the way the joins are generated. 



1 .2.2 Internal Program Setting 

Another kind of setting that can affect syntax'generation is 'programmatic 1 setting. By this we 
mean an internal setting in memory that comes into play during actual execution of the system The 
original source or deciding factors for this internal setting may be a combination of the given report 
instance and multiple VLDB Settings on different objects involved in the report 

A classic example of this kind of setting is 'outer join* setting. Given a report with one or more 
metrics - the kind of join that needs to be done between different tables is decided by a complex 
algorithm involving the settings on the different metrics and the report and the support provided by a 
particular database type for different kinds of outer joins. In the end, when it actually comes to 
syntax generation all we have is a memory location that indicates two tables have to be *left outer 
joined* or c equi joined 1 etc. 

Now depending on the memory setting we need to generate different syntax. To use the example of 
types of joins ,lctus revisit the SQL gent^atcd in section 1.1 for SQLServer, 




We observe that the tables a j and Cl 2 are 'equi joined'. If the memory settings had dictated that the 
tables be *full outer joined', then the SQL generated is shown below. 




Notice the changes in ti&fiom ^td select clauses. 
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A pattern is a string that describes how a given set of argument strings is manipulated to produce a 
result string. 
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5 ! ! VLDB SYNTAX G^MMAK 

We now introduce the concept of VLDB Syntax Grammar (VSG). VSG is vised to describe the 
syntax that is generated by the SQL Engine. It is a specialized grammar that uses patterns, VLDB 
settings, memory settings and string arguments to describe the syntax generated. 



:■! 1 Example 

Let us consider the example of generating the join syntax, specifically we will consider the SQL 92 
standard syntax. A typical SQL92 join clause is shown below; 

TABLE1 al 
join TABLE2 a2 
. . on (al. STORE JD = a2.ST0REjD) 
and (al. ITEM ID = a2JTEMJD) 

We could use a context free grammar syntax to describe the above construct 

NODESQL - TableName AliasNarne 

NODESQL- NODESQL JOINTVPE NODESQL JOINSQL 

JOINTYPE -join | aross join | left outer join \jull outer join \ right outer join 

JOINSQL - (COLEXPR) 
JOINSQL - JOINSQL ^ JOINSQL 

COLEXPR - COLUMN = COLUMN 

COLUMN - AliasName.Q)lurnnNgirifi 

In the above i»« mn« iT the following variables are resolved to frrmfnatirtg strings that come from 
different basic SQL entities such as table names, column names, alias names etc. 

TabieName - obviously the name of the table being joined 
AliasName - name of the alias given to a particular table being joined 
ColumnName- name of a taking part in a join 

The grammar shown above is intended ro describe language syntax. We need a specialized 
grarrrmar that describes how a language syntax should be generated u ^ n B patterns and settings, 
hence the concept of a V$G- We now introduce the basic concepts of a VSG. 



VSG - Basic Concepts 

We now introduce the different grammar constructs mat make up a VSG. We need to keep in mind 
that a VSG is a language generation grammar and it is very closely tied to syntax generation 
paradigms hi the Castor SQL Engine. 



2.1 Applying a pattern 

One of the constructs used in a VSG is that of applying a pattern to a set of string arguments to 
generate a result string. The two elements used in this construct are the pattern mat is to be applied 
and the string arguments that the pattern applies to. 

The general form of this construct is shown below. 
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PATTERN (string arguments) 

A particular example of this construct is shown below. 

TABPATTERN ^ (tablename, tablealuu) wherc TAB PATTERN * ^ 

The pattern that is applied itself can be chosen based on different settings as we will see in the next 
section. 

2.2 Choosing a pattern 

Another construct used iri a VSG is that of choosing a pattern based on a setting. This setting can be 
a VLDB Setting or it can be amemory setting. The general form of this construct is shown below, 

SETTlNG(PaXtcml ORPattern2 OR ... PattcrnN) 

An example of this construct is that of choosing different patterns for joins depending on an internal 
.memory setting for type of joins. 

JomSettingfREGUIdRTOINPATTERN OR CkOSSJOWPATTERN OR 
LEFTOUTERJOINPA TTERN OR FULlOUTERJOFNPA TTERN) 

23 Other Constructs 

Some of the other constructs used in a VSG are: 

1- vector<XXX> * a shortcut used to denote a vector of strings. 

3 An Example VSG 

We will now look at an example VSG. As in section 3.1, we will consider the case of generating 
' SQL92 standard join syntax. 

NODESQL - TABPATTERN -> {TableNamc, AliasName) 

NODESQL - APPENDPATTERN (JOINSQL ,ONSQL) 

JOINSQL - JOINPATTERN {NODESQL, NODELSQL) 

JOINPATTERN - IntemaUoinSetting (REGXJIARJOINPATmiN OR CROSS JOINPATTERN 
OR LEFTOUTERJOINPATTERN OR RJLTJDUIERJOINPATTERN OR 
RIGHTOUTERJOINPATrBRN) 

ONSQL - ONP ATTERN -> ((vector<COLEXPR>), (vector<COLEXPR>)) 
. COLEXPR - COLPATTERN (AliasName, ColurnnName) 
TABPATTERN -#0 #1 
ONPATTERN-#0 = #1 |#0#<and£*#> 
REGULARJOINPATTERN - #0 join #1 
APPENDPATTERN - #0#< #*#> 
CROSSJOINPATTERN - #0 cross join #1 
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CASTOR SYNTAX GENERATION 

In the previous sections we looked at patterns and how they axe used in VSGs. We now discuss 
bow this all lies together in providing the flexibility and database independence wc need in Castor. 



4 1 VSG ANALYSIS 

As we saw earlier a VSG describes how the Castor SQL Engine should generate syntax. The real 
power and flexibility of the VSG comes from the Diet that it uses patterns to drive syntax generation. 
If these patterns are changed the syntax generated changes accordingly. The patterns themselves axe 
not hard coded inside the SQL Engine. They exist as properties on a DBMS object By changing 
the partem properties on the database object we can dynamically change the syntax generated 
without the need to recompile the engine. 

Thus any part of the SQL syntax that is generated by a rule of the general form pattern 
(string arguments) can be modified dynamically by changing the pattern 

In addition to the effect of changing patterns, we also have the constructs where patterns are selected 
depending on VT-DB settings. Thus when we change a VLDB setting, it causes a different pattern 
to be selected and changes the SQL generation. 

The one construct which does not have a truly dynamic effect is that of choosing patterns based on 
* internal memory settings. The internal memory settings are set by an algorithm based on settings on 
' objects and internal logic and hence can only be indirectly controlled 
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'atterns and VSG Grammars : Create Table 

his document describes the patterns and VSG grammars that were used in the Create Table pan of the 
yntax Abstraction. The purpose of this syntax abstraction is to increase the flexibility of the Castor Engine 
. ith regard to the SQL generation part The flexibility is increased by removing the hard encoding from 
. le code. This is replaced by patterns, which can be changed when necessary without the need to recompile 
. ie code again. As an example, this could eliminate the problems when a database vendor changes the way 
i does joins, 

) General Patterns 



his section gives a few examples of patterns, which are of a general nature and are thus suitable for not 

■ tily the Create Table VSG but also for other VSGs, 

■ PPENDPATTERN #0#<#*#> 

This pattern concatenates all its arguments without any separators. 
I MPTYPATTERN 

This pattern returns an empty string. 
: BPARATORAPPENDPATTERN #1#<#0#*#> 

This pattern needs at least two arguments and uses its first argument as a separator in the 

concatenation of the rest of its arguments. 
PPLYPATTERN #0|#] 

This pattern uses the result of its fust argument as input for its second. 



:; ) The Create Table VSG 

'. Ilia section focuses on the specific VSG for the Create Table syntax. This VSG is based on the 'DDL and 
.. ML syntax' document The VSG in this section only focuses on the generation of the 
:: ^LStatementBlock. The starting symbol of this VSG is CREATETABLEPATTERN. 



< REATETABLEPATTERN 

': ableprestatpattbrn 
■: ablepoststatpattern 

< reatetablepattern. . 
c reatedatapattern 

'I i^mptablespacepattekn 

J ecpidpattern 

'I ablemtostatpattern 

] jsertintopattern 
f ^ecthintpattern 

; SSELECTH1NTPATTERN . 

.) TOPATTERN 

P ::LECTONLYPATT£RN 

ILLICIT 



CrcatoTabIeSetting(EXPLICIT OR IMPLICITORACLE OR 
IMPLICITSQLSERVER OR MPLICITWORMIX) 
SEPAJRATORAPPENDPATTERJ^> (gSPACE, 
TablePreStatementl, TablePreStatement2) 
SEPARATORAPPENDPATTERN-^ (gSPACE, 
TablePostStatement 1 , TablePostStatement2) 
create #0 tabJe #1 #2#3 

(SEPARATORAPPENDPATTERN^ (gSPACE, coin, 
data_type. Constraint) 

APPENDPATTERN-^. (TempTableSpace) 
RICPIDSetting(RI_Constraint OR Partition Jfodex_definition) 
SEPARATORAPPENDPATTERN (gSPACE, 
TableMidStatementt, Tab)eMidStatement2) 
insert into #0#1 

SEPARATORAPPENDPATTEKN — Kjffiffo 

SelectHinlStatcment, FromStalement, Where Statement, 

GroupByStatement, HavingStatement) 

APPENDPATTERN-^ (gAS, gSPACE, 

SELECTHINTPATTERN) 

into#0#l #2*3 

SelectHintStatement 

SEPARATORAPPENDPATTEKN"*' (gEOL, 
— XA£££PR£^A£PA^R*$r CREATETABLEPATTERN, 
CREATEDATAPATTERN, 
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1 1PLICITORACLE 



3 .1PLICITSQLSERVER 



.IPLICITINFOKMIX 



TEMPT ABLESP ACEP ATTEKN , RICPTDP ATTEKN , 
4-^^MlDS-fAT^^ 
.^IZL^THIM^ATT-E^ 
SEPARATORAPPENDPATTERN-^ (gEOL, 
TABLEPRESTATPATTERN, CREATETABLEPATTERN, 
TEMPTABLESPACEPATTERN, APPENDPATTERN(gAS, 
SELBCTHINTPATTERN), TABLEPOSTSTATPA'iTERN) 
SEPARATORAPPENDPATTERN-^ (gEOL, 
TABLEPRESTATPATTERN, SELECTONLYPATTERN, 
INTOPATTERN, TEMPTABLESPACEPATTERN, 
FromStatement, WheTeStatement, GroupByStatement, 
HavingStatement, TABLEPOSTSTATPATTERN) 
SEPARATORAPPENDPATTERN+* (gEOL, 
TABLEPRESTATPATTERN, SELECTHENTPATTERN, 
INTOPATTEKK, TEMPTABLESPACEPATTERN, 
TABLEPOSTSTATPATTERN) 



'! be VSG described above uses, a few constants with the following meaning: 



- ->EOL 

- &SPACE 
• *AS 



End Of Line character 
A single space 
The string 'as' 
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Abstract 

The CreateTable VSG presented in this document is intended to increase the flexibility in the 
generation of Create Table syntax. The patterns described in this document have been replaced by a single 
pattern, which is described in the VSG patterns and Usage document. Please refer to that document 
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1. Introduction 



VSG Grammar presented in this document is based on the code found in 
SyntaxSQL::GenCreateStmt. The next section presents the VSG itself and a small example to 
clarify it. This section gives some general patterns that will be frequentiy used (hopefully). 



The following three operators are used in this document: 

• PATTERN->(string arguments) 

• SETTING (Patteml OR Pattern2 OR . . . PatternN) 

The PATTERN and SETTING-operator are explained in the document 'Zen and the Art of VLDB 
Syntax Generation'. 

Some pattern notations and their meanings: 

• #< begin repeat block 

• #> end repeat block 

• #0 first argument also called explicit variable 

• # 1 second argument also called explicit variable 

• #* remaining arguments also called implicit variable 

• | pipe, use output of left part as input for right 

• Non-Terminal words appearing on the left and possibly on the right 

• Terminal words appearing only on the right 



AppendPat 



: #0#<#*#> 
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2. New VSG arguments and operators 

This section examines the new VSG operators and the interactions between arguments and patterns. 
First, subsection 2. 1 examines the interaction between patterns and the arguments to which they are 
applied. Second, subsection 2.2 introduces two new VSG operators, head and tail. 



2.1 Of Strings, Vectors, and Patterns 

The construction of the columnlines in the CreateTable VSG is the most complex part. This 
complexity is caused by the inclusion of both strings and string- vectors. The patterns that are used to 
define a VSG are input independent. This means, that the meaning of a pattern does not change 
based on the type of its inputs. The presence of a vector in the input can, however, result in an 
implicit loop, which becomes explicit in the code. 

In order to determine if the output of a pattern is a string or a vector of strings the following simple 
rules are used: 

1) If an explicit variable is linked with a vector argument, then the output of the pattern is a vector 
of strings. 

2) If no explicit variables are linked with vector arguments, then the output of the pattern is a 
string. 

In the rules above an explicit variable is indicated by the #2 token, which points to the third 
argument. The only implicit variable is represented by the #* token, which points to the arguments 
that remain after all the explicit variables have been linked. 

For example, the #0#<#*#>#1 pattern has two explicit variables (#0 and #1), and one implicit 
variable (#*). Depending on its arguments this pattern can produce either a string or a vector of 
strings. In the following, some arguments are used to clarify this. 

• Input: vector<LastName>, vector<FirstName>, USA, vector<CityData> 

• Output: vector of strings, with each string contains a LastName followed by 'USA*, the entire 
CityData vector and finally a FirstName. 

• Input: LastName, FirstName, vector<PersonalData> 

• Output: string, which contain LastName followed by the contents of the entire PersonalData 
vector and FirstName at the end. 

• Input: LastName, FirstName, ZipCode, Address 

• Output: string, consisting of LastName, ZipCode, Address, and FirstName 

Note that when a pattern is applied it always needs at least as many arguments as it has explicit 
variables. 

A final note on the difference between vectors that are linked to explicit variables and those linked 
to the implicit variable. When a vector is linked to the implicit variable, it is treated as a number of 
unrelated strings. In this case, it is not possible to see the difference between using that vector as the 
input argument or its string elements as separate input arguments. 

When a vector is linked to an explicit variable, the pattern to which it contains is executed for each 
of the elements of that vector separately. This results in a loop and a vector of string results. A 
consequence of this approach is that when multiple vectors are linked to explicit variables, these 
vectors have to have the same length. If this is not the case, then the pattern can not be applied as it 
has not enough arguments. 



2.2 Of heads.and tails. 
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When a vector of strings has to be used to produce a string of its elements with a certain separator in 
between, then the rules above do not allow it. In order to still follow the simple rules, two additional 
operators on vectors are needed. They are the head and the tail operator. Their names suggests 
exactly what they are meant to do. The head operator takes a non-empty vector as input and 
produces the first string element of that vector. The tail operator also takes a non-empty vector as 
input and removes the first element of the input vector, which results in a vector output. 

For example given a vector of lines the following pattern can be used to separate those lines with 
commas, which results in a string. 

• SeparatorPattern->(head(vector<lines>) ) tail(vector<lines>)) 

• SeparatorPattern : #0#<,#*#> 

On the other hand if no head or tail operator is used, it is not possible to get the desired result 
without bending the rules or inventing new operator. 

• Sep2Pattern^(vector<lines>) 

• Sep2Pattern : #<,#*#> 

In this case, the first comma has to be deleted by a new operator. 

• Sep3Pattern->(vectoKlines>) 

• Sep3Pattern : #0#<,#*#> 

In this case, the result is a vector equal to the input vector, because according to rule the vector gets 
linked to an explicit variable. In order to get the desired result, the rule has to be extended to allow a 
vector to be linked to multiple variables. This could get confusing and is much better handled by the 
introduction of the head and tail operator as described above. 
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3. The Create Table VSG 

This section presents the Create Table VSG. This presentation has been split into four parts. First, 
the patterns dealing with greater chunks of SQL are presented in subsection 3.1. Subsection 3.2 
presents the details of the first line of the CreateTable syntax, the so-called CreateLine. Next, 
subsection 3.3 describes how the syntax needed for columns in CreateTable syntax is generated. 
Finally, subsection 3.4 presents the details of the so-called PostColumnPart. 



3.1 Overview Patterns 

TotalCreateTable : AppendPat -> (CreateLine, ColumnPart, PostColumnPart) 

CreateLine : AppendPat -> (CreatePart, QualPart, TablePart, DescrPart, TTNamePart, 

OptionPart) 

PostColumnPart : AppendPat-> (SpacePart, PostColurrinlndexPart) 

The overview patterns give an overview of the patterns generated to produce a create table 
statement. TotalCreateTable is the starting symbol, which shows the three main parts of the create 
table statement. The most important part is generated by ColurnnPart, which combines the column 
names, types and constraints. CreateLine takes care of the correct syntax before the columns start. 
The PostColumnPart takes care of the syntax after the columns have been enumerated. 



3.2 Detail Patterns of CreateLine 



CreatePart 

QualPart 

TablePart 

DescrPart 

TTNamePart 

OptionPart 



AppendPat^(gCREATE, gSPACE) 
AppendPat^(Qualifier, gSPACE) 
AppendPat^(gTABLE, gSPACE) 
AppendPat -^(Descriptor, gSPACE) 
AppendPat^(TempTableName, gSPACE) 
AppendPat^(Option, gSPACE) 



The CreateLine consists of a number of parts, which are either always present (such as CreatePart, 
TablePart, and TTNamePart) or which might be absent (such as QualPart, DescrPart, and 
OptionPart). An optimization in the case of possible absent parts is to only include them when they 
contain something useful. This can be realized by a test in the code. The advantage of this approach 
is that no extra (harmless) characters clutter the formatting of the SQL syntax. 



3.3 Details of ColumnPart 

ColumnLine : VecAppendPat->(gTABKEY, (vectoKColName>), (vector<CoiType>), 

(vector<ColConstraint>)) 
VecAppendPat : #0#1#0#2#0#3 

ColumnPart : ControlInsertPat->(Control, head(vector<ColumnLine>), 

tail(vectoi<ColumnLine>)) 
ControlInsertPat : (#0#1#<,#0#*#>) 



The syntax of the columns are generated by ColumnLine with the help of the VecAppendPat 
pattern, which results in a vector of ColumnLines. These ColumnLines are subsequently used in the 
generation of the ColumnPart. This adds the control to each line, the parentheses at the beginning 
and end, and the comma's between the ColumnLines. The inclusion of the commas between the 
ColumnLines is the cause of the use of the head and tail operator. 
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3.4 Detail Patterns of PostColumnPart 

The TotalCreateTab always appends the SpacePart and PostCoIumnPartition to the end of the 
Create Table syntax. Both these parts are optional and the remarks regarding optional syntax in 
subsection 3.2 also apply in this case. They can also be used to include database specific syntax. 

SpacePart : AppendPat^ (gSPACE, Space, gSPACE, TableSpaceStatement) 

PostCoIumnPartition : AppendPat^ (Control, PPIPart) 

The terminal TableSpaceStatement' is optional and can be used to specify the TableSpace in which 
a particular table has to be created. 

PPIPart : PPISetting(gEMPTY, PartitionKeyColPart, PrimarylndexPart) 

The PPISetting is a setting based on the value of the Intermediate Table Index setting. The results of 
the PPISetting are summarized in the following table, which also takes the index creation into 
account. 



Intermediate Table Index 


DBMS 


Partitioning Key 


Create Index 


0 


All 


No 


No 


1 


Teradata 


Yes 


No 


1 


DB2/UDB 


Yes 


Yes 


1 


Other 


No 


No 


2 


Teradata or 
DB2/UDB 


No 


No 


2 


Other 


No 


Yes 



Note that the only part of PartitionKeyColPart & PrimarylndexPart that has to be filled in are the 
column names. This means it makes no difference which pattern is used, as they both can be applied 
on the same arguments. The rest of those two patterns only consist of keywords. The creation of 
create index SQL is discussed in section 4.1. 

PartitionKeyColPart : AppendPat(PartitionPart, KeyPart, IndexColPart) 
PartitionPart : AppendPat(gPARTITIONING gSPACE) 

KeyPart : AppendPat(gKEY, gSPACE) 

The three lines above present the details of the UDB partitioning key generation. The most 
interesting par t is the IndexColPart, which takes care of the generation of the column names, which 
are used in the partitioning key. Subsection 4. 1 describes the exact definition of IndexColPart. 

PrimarylndexPart : AppendPat(PrePrimaryPart, IndexColPart, PostPrirnaryPart) 
PrePrimaryPart : AppenaTat(PrimaryPart, IndexPart) 

PrimaryPart : AppendPat(gPRIMARY, gSPACE) 

The lines above present the details of the primary index generation for Teradata. Again the most 
interesting part is the generation of the column names by IndexColPart described in section 4. 1 . The 
other non-terminals take care of the generation of the necessary keywords. The IndexPart non- 
terminal is described in detail in section 4. 1. 
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4. Index Creation and Partitioning Keys 



This section explains the VSG for the creation of indexes on tables. Subsection 4. 1 presents the 
VSG, which is needed in order to create indexes. 



This section describes the VSG grammar needed in the generation of Create Index syntax. 



IndexCreation consists of three parts. The PrelndexPart generates the SQL before the index key. 
The IndexColPart generates the key upon which the index will be based. The index key is 
influenced by the 'Max Columns in Index' setting. This setting determines the maximum number of 
columns, which could be included in an index or partition key. Finally the PostlndexPart takes care 
of the IndexPostStatement The value for this IndexPostStatement is taken directly from the 'Index 
Post String* setting. 

ColumnCommaAppendPat : (#0#<,#*#>) 

The most interesting pattern of this section is the ColumnCommaAppendPat, which is used in the 
generation of key, upon which the index is based. It start with a left parenthesis followed by each of 
the columnnames, which are if necessary separated by comma's, and finishes with a right 
parenthesis. 

CreatelndexPart : AppendPat(CreatePart, IndexPart) 

IndexPart : AppendPat(gINDEX, gSPACE) 

IndexOnPart : AppendPat(IndexNamePart, OnPart, TTNamePart) 

IndexNamePart : AppendPat(gI_, TTNamePart) 

OnPart : AppendPat(gON, gSPACE) 

The five lines above present the details of the PrelndexPart. All they do is use the AppendPat pattern 
to append various parts. The CreatePart and TTNamePart can be found in section 3.2. 



4.1 



Create Index VSG 



IndexCreation 

PrelndexPart 

IndexColPart 



: AppendPat(PreIndexPart, IndexColPart, PostlndexPart) 

: AppendPat(CreateIndexPart, IndexOnPart) 

: ColumnCommaAppendPat(head(vector<KeyColName>), 

tail(vector<KeyColName>)) 

: AppendPat(gSPACE, IndexPostStatement) 



PostlndexPart 
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5. VSG Implementation Details 

This section gives a short overview of the implementation of Create Table VSG and Create Index 
VSG. Section 5.1 presents the details of the Create Table VSG implementation. The Create Index 
VSG implementation is described in section 5.2. Section 5.2.1 describes CreatelndexColumnsVSG, 
an auxilary method. 



5.1 Create Table VSG 

In this section the CreateStmtVSG method, which implements the Create Table VSG, is discussed. 
The function of this method is to generate create table SQL. It is called from the Parse method in the 
DFCCompUnit class. It generates the column names, types, and possible constraints based on the 
input value of the attribute and fact column names and types and on the constraints input vector. The 
VLDBInfo is used to get the tablesettings, that is the qualifier, descriptor, etc. The VLDBInfo is also 
used to get the Intermediate Table Index setting. This setting is used in the case of Teradata and 
DB2 databases, which need to generate a partitioning key. 

Checks are made to ensure correct input. This entails an equal number of attribute column names 
and types. The same goes for the fact column names and types. Furthermore, no create table SQL 
can be generated if not at least one column is present. Note that a table can exist with only fact 
columns. 



5.2 Create Index VSG 

This section discusses the CreatelndexStmtVSG method, which takes care of the implementation of 
the Create Index VSG. The function of this method is to generate create index SQL. It is called from 
the Parse method in the DFCCompUnit class. Using the names of the attribute columns, the 
VLDBInfo, and the temp table name the create index statement is generated and returned to the 
caller. Note that a check is performed on the size of the attribute column names vector, as an index 
can not be generated without any attribute column names. 



5.2.1 CreatelndexColumnsVSG 

This section discusses the CreatelndexColumnsVSG method. The function of this method is to get 
the names of the attributes, which will be used in the index creation by CreatelndexStmtVSG. This 
depends on the Max_Col_In_Index setting, which is read from the VLDBInfo. This method is 
called by CreatelndexStmtVSG and by CreateStmtVSG. It assumes that it receives a non-empty 
vector of attribute names. Note that this is not checked. 
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